Adaptive unequal weight planar prediction

ABSTRACT

A method of partitioning a video coding block for JVET, comprising representing a JVET coding tree unit as a root node in a quadtree plus binary tree (QTBT) structure that can have a quadtree branching from the root node and binary trees branching from each of the quadtree's leaf nodes, using asymmetric binary partitioning to split a coding unit represented by a quadtree leaf node into two child nodes of unequal size, representing the two child nodes as leaf nodes in a binary tree branching from the quadtree leaf node, and coding the child nodes represented by leaf nodes of the binary tree with JVET, wherein coding efficiency is improved by taking advantage of the similarity of coding modes 2 and 66.

CLAIM OF PRIORITY

This application claims priority to U.S. patent application Ser. No. 16/753,269 filed Apr. 2, 2020, which claims priority to PCT/US2018/055099 filed Oct. 9, 2018, which claims the benefit of U.S. Provisional Application Ser. No. 62/569,868, filed Oct. 9, 2017, all of which are hereby incorporated by reference.

TECHNICAL FIELD

The present disclosure relates to the field of video coding, particularly increased coding efficiency enabling higher bit-rates, resolutions and better quality video by reducing the number of modes for encoding.

BACKGROUND

The technical improvements in evolving video coding standards illustrate the trend of increasing coding efficiency to enable higher bit-rates, higher resolutions, and better video quality. The Joint Video Exploration Team is developing a new video coding scheme referred to as JVET. Similar to other video coding schemes like HEVC (High Efficiency Video Coding), JVET is a block-based hybrid spatial and temporal predictive coding scheme. However, relative to HEVC, JVET includes many modifications to bitstream structure, syntax, constraints, and mapping for the generation of decoded pictures. JVET has been implemented in Joint Exploration Model (JEM) encoders and decoders which utilize various coding techniques including weighted angular prediction.

In current JVET design, 67 angular coding modes are used to determine the prediction CU. However, two of those coding modes (mode 2 and mode 66) share a common angle. Accordingly, what is needed is a system and method of coding JVET that exploits the common angle of modes 2 and 66 to reduce coding burden.

SUMMARY

A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes defining a coding unit (CU) within a coding area of a video frame having CU x and CU y coordinates, and also includes defining a main reference pixel within said coding area having main x and main y coordinates associated with said main reference. The step can also include defining a side reference pixel within said coding area having side x and side y coordinates associated with said side reference. The system and method can also include defining a set of prediction modes and/or identifying two discrete prediction modes within said set of prediction modes. Further, the system and method can also include selecting a prediction mode from said set of prediction modes and/or generating a prediction CU for said coding unit based at least in part on a combination of said main reference pixel and said side reference pixel. Additionally, the system and method can include a step where said prediction CU for said coding unit is coded in the same manner for each of said two discrete prediction modes, where each of said two discrete prediction modes is differentiated based at least in part on a prediction direction. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

Implementations can include one or more of the following features: the method of coding JVET video where the prediction direction is based upon one or more characteristics of said coding unit; the method of coding JVET video where said prediction CU is entropy coded; the method of coding JVET video where the prediction direction is based at least in part on a width of said coding unit; the method of coding JVET video where said prediction modes include modes of integer values between 0 and 66; and/or the method of coding JVET video where said two discrete prediction modes are mode 2 and mode 66. In some embodiments, the method of coding JVET video can be based upon the step where coding associated with prediction mode 2 includes: determining a main weight value associated with said main reference pixel, determining a side weight value associated with said side reference pixel, and generating a prediction CU for said coding unit based at least in part on a combination of said main reference pixel combined with said main weight value and said side reference pixel combined with said side weight value. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.

BRIEF DESCRIPTION OF THE DRAWINGS

Further details of the present invention are explained with the help of the attached drawings in which:

FIG. 1 depicts division of a frame into a plurality of Coding Tree Units (CTUs).

FIG. 2 depicts an exemplary partitioning of a CTU into Coding Units (CUs) using quadtree partitioning and symmetric binary partitioning.

FIG. 3 depicts a quadtree plus binary tree (QTBT) representation of FIG. 2's partitioning.

FIG. 4 depicts four possible types of asymmetric binary partitioning of a CU into two smaller CUs.

FIG. 5 depicts an exemplary partitioning of a CTU into CUs using quadtree partitioning, symmetric binary partitioning, and asymmetric binary partitioning.

FIG. 6 depicts a QTBT representation of FIG. 5's partitioning.

FIGS. 7A and 7B depict simplified block diagrams for CU coding in a JVET encoder.

FIG. 8 depicts 67 possible intra prediction modes for luma components in JVET.

FIG. 9 depicts a simplified block diagram for CU decoding in a JVET encoder.

FIG. 10 depicts an embodiment of a method of CU coding in a JVET encoder.

FIG. 11 depicts a simplified block diagram for CU coding in a JVET encoder.

FIG. 12 depicts a simplified block diagram for CU decoding in a JVET decoder.

FIG. 13 depicts a simplified block diagram of an increased efficiency coding system and method.

FIG. 14 depicts a simplified block diagram for CU coding with increased efficiency in a JVET encoder.

FIG. 15 depicts a simplified block diagram for CU decoding with increased efficiency in a JVET decoder.

FIG. 16 depicts an embodiment of a computer system adapted and/or configured to process a method of CU coding.

FIG. 17 depicts an embodiment of a coder/decoder system for CU coding/decoding in a JVET encoder/decoder.

DETAILED DESCRIPTION

FIG. 1 depicts division of a frame into a plurality of Coding Tree Units (CTUs) 100. A frame can be an image in a video sequence. A frame can include a matrix, or set of matrices, with pixel values representing intensity measures in the image. Thus, a set of these matrices can generate a video sequence. Pixel values can be defined to represent color and brightness in full color video coding, where pixels are divided into three channels. For example, in a YCbCr color space pixels can have a luma value, Y, that represents gray level intensity in the image, and two chrominance values, Cb and Cr, that represent the extent to which color differs from gray to blue and red. In other embodiments, pixel values can be represented with values in different color spaces or models. The resolution of the video can determine the number of pixels in a frame. A higher resolution can mean more pixels and a better definition of the image, but can also lead to higher bandwidth, storage, and transmission requirements.

Frames of a video sequence can be encoded and decoded using JVET. JVET is a video coding scheme being developed by the Joint Video Exploration Team. Versions of JVET have been implemented in JEM (Joint Exploration Model) encoders and decoders. Similar to other video coding schemes like HEVC (High Efficiency Video Coding), JVET is a block-based hybrid spatial and temporal predictive coding scheme. During coding with JVET, a frame is first divided into square blocks called CTUs 100, as shown in FIG. 1. For example, CTUs 100 can be blocks of 128×128 pixels.

FIG. 2 depicts an exemplary partitioning of a CTU 100 into CUs 102. Each CTU 100 in a frame can be partitioned into one or more CUs (Coding Units) 102. CUs 102 can be used for prediction and transform as described below. Unlike HEVC, in JVET the CUs 102 can be rectangular or square, and can be coded without further partitioning into prediction units or transform units. The CUs 102 can be as large as their root CTUs 100, or be smaller subdivisions of a root CTU 100 as small as 4×4 blocks.

In JVET, a CTU 100 can be partitioned into CUs 102 according to a quadtree plus binary tree (QTBT) scheme in which the CTU 100 can be recursively split into square blocks according to a quadtree, and those square blocks can then be recursively split horizontally or vertically according to binary trees. Parameters can be set to control splitting according to the QTBT, such as the CTU size, the minimum sizes for the quadtree and binary tree leaf nodes, the maximum size for the binary tree root node, and the maximum depth for the binary trees.

In some embodiments JVET can limit binary partitioning in the binary tree portion of a QTBT to symmetric partitioning, in which blocks can be divided in half either vertically or horizontally along a midline.

By way of a non-limiting example, FIG. 2 shows a CTU 100 partitioned into CUs 102, with solid lines indicating quadtree splitting and dashed lines indicating symmetric binary tree splitting. As illustrated, the binary splitting allows symmetric horizontal splitting and vertical splitting to define the structure of the CTU and its subdivision into CUs.

FIG. 3 shows a QTBT representation of FIG. 2's partitioning. A quadtree root node represents the CTU 100, with each child node in the quadtree portion representing one of four square blocks split from a parent square block. The square blocks represented by the quadtree leaf nodes can then be divided symmetrically zero or more times using binary trees, with the quadtree leaf nodes being root nodes of the binary trees. At each level of the binary tree portion, a block can be divided symmetrically, either vertically or horizontally. A flag set to “0” indicates that the block is symmetrically split horizontally, while a flag set to “1” indicates that the block is symmetrically split vertically.
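By way of a non-limiting illustration, the quadtree split and the flag-controlled symmetric binary split described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that a "horizontal" split means a horizontal dividing line that halves the block height; the disclosure does not spell out that geometric convention.

```python
def split_quadtree(size):
    """Split a square block of side `size` into four equal square children."""
    half = size // 2
    return [(half, half)] * 4


def split_symmetric(width, height, flag):
    """Split a block in half along its midline.

    flag 0: symmetric horizontal split (assumed: height is halved).
    flag 1: symmetric vertical split (assumed: width is halved).
    Returns the (width, height) of the two child blocks.
    """
    if flag == 0:
        return [(width, height // 2)] * 2
    if flag == 1:
        return [(width // 2, height)] * 2
    raise ValueError("flag must be 0 or 1")


# A 128x128 CTU split by the quadtree, then one leaf split with flag 1.
leaves = split_quadtree(128)
children = split_symmetric(*leaves[0], flag=1)
```

Applying the binary split repeatedly to its own output models the recursive binary tree portion of the QTBT.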

In other embodiments JVET can allow either symmetric binary partitioning or asymmetric binary partitioning in the binary tree portion of a QTBT. Asymmetrical motion partitioning (AMP) was allowed in a different context in HEVC when partitioning prediction units (PUs). However, for partitioning CUs 102 in JVET according to a QTBT structure, asymmetric binary partitioning can lead to improved partitioning relative to symmetric binary partitioning when correlated areas of a CU 102 are not positioned on either side of a midline running through the center of the CU 102. By way of a non-limiting example, when a CU 102 depicts one object proximate to the CU's center and another object at the side of the CU 102, the CU 102 can be asymmetrically partitioned to put each object in separate smaller CUs 102 of different sizes.

FIG. 4 depicts four possible types of asymmetric binary partitioning in which a CU 102 is split into two smaller CUs 102 along a line running across the length or height of the CU 102, such that one of the smaller CUs 102 is 25% of the size of the parent CU 102 and the other is 75% of the size of the parent CU 102. The four types of asymmetric binary partitioning shown in FIG. 4 allow a CU 102 to be split along a line 25% of the way from the left side of the CU 102, 25% of the way from the right side of the CU 102, 25% of the way from the top of the CU 102, or 25% of the way from the bottom of the CU 102. In alternate embodiments an asymmetric partitioning line at which a CU 102 is split can be positioned at any other position such that the CU 102 is not divided symmetrically in half.
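The four 25%/75% split types can be illustrated with a short sketch. The function name and the string labels for the four types are hypothetical and chosen only for readability; children are listed left to right or top to bottom.

```python
def split_asymmetric(width, height, kind):
    """Split a block along a line 25% of the way in from one side.

    kind is one of 'left', 'right', 'top', 'bottom' and names the side
    from which the 25% position is measured.
    Returns the (width, height) of the two children in spatial order.
    """
    if kind in ("left", "right"):
        quarter = width // 4
        sizes = [(quarter, height), (width - quarter, height)]
    elif kind in ("top", "bottom"):
        quarter = height // 4
        sizes = [(width, quarter), (width, height - quarter)]
    else:
        raise ValueError("unknown asymmetric split type")
    if kind in ("right", "bottom"):
        # The 25% child is the second one in spatial order.
        sizes.reverse()
    return sizes
```

For a 32×16 parent, a 'left' split in this sketch yields an 8×16 child followed by a 24×16 child, that is, 25% and 75% of the parent's area.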

FIG. 5 depicts a non-limiting example of a CTU 100 partitioned into CUs 102 using a scheme that allows both symmetric binary partitioning and asymmetric binary partitioning in the binary tree portion of a QTBT. In FIG. 5, dashed lines show asymmetric binary partitioning lines, in which a parent CU 102 was split using one of the partitioning types shown in FIG. 4.

FIG. 6 shows a QTBT representation of FIG. 5's partitioning. In FIG. 6, two solid lines extending from a node indicate symmetric partitioning in the binary tree portion of a QTBT, while two dashed lines extending from a node indicate asymmetric partitioning in the binary tree portion.

Syntax can be coded in the bitstream that indicates how a CTU 100 was partitioned into CUs 102. By way of a non-limiting example, syntax can be coded in the bitstream that indicates which nodes were split with quadtree partitioning, which were split with symmetric binary partitioning, and which were split with asymmetric binary partitioning. Similarly, syntax can be coded in the bitstream for nodes split with asymmetric binary partitioning that indicates which type of asymmetric binary partitioning was used, such as one of the four types shown in FIG. 4.

In some embodiments the use of asymmetric partitioning can be limited to splitting CUs 102 at the leaf nodes of the quadtree portion of a QTBT. In these embodiments, CUs 102 at child nodes that were split from a parent node using quadtree partitioning in the quadtree portion can be final CUs 102, or they can be further split using quadtree partitioning, symmetric binary partitioning, or asymmetric binary partitioning. Child nodes in the binary tree portion that were split using symmetric binary partitioning can be final CUs 102, or they can be further split recursively one or more times using symmetric binary partitioning only. Child nodes in the binary tree portion that were split from a QT leaf node using asymmetric binary partitioning can be final CUs 102, with no further splitting permitted.

In these embodiments, limiting the use of asymmetric partitioning to splitting quadtree leaf nodes can reduce search complexity and/or limit overhead bits. Because only quadtree leaf nodes can be split with asymmetric partitioning, the use of asymmetric partitioning can directly indicate the end of a branch of the QT portion without other syntax or further signaling. Similarly, because asymmetrically partitioned nodes cannot be split further, the use of asymmetric partitioning on a node can also directly indicate that its asymmetrically partitioned child nodes are final CUs 102 without other syntax or further signaling.

In alternate embodiments, such as when limiting search complexity and/or limiting the number of overhead bits is less of a concern, asymmetric partitioning can be used to split nodes generated with quadtree partitioning, symmetric binary partitioning, and/or asymmetric binary partitioning.

After quadtree splitting and binary tree splitting using either QTBT structure described above, the blocks represented by the QTBT's leaf nodes represent the final CUs 102 to be coded, such as coding using inter prediction or intra prediction. For slices or full frames coded with inter prediction, different partitioning structures can be used for luma and chroma components. For example, for an inter slice a CU 102 can have Coding Blocks (CBs) for different color components, such as one luma CB and two chroma CBs. For slices or full frames coded with intra prediction, the partitioning structure can be the same for luma and chroma components.

In alternate embodiments JVET can use a two-level coding block structure as an alternative to, or extension of, the QTBT partitioning described above. In the two-level coding block structure, a CTU 100 can first be partitioned at a high level into base units (BUs). The BUs can then be partitioned at a low level into operating units (OUs).

In embodiments employing the two-level coding block structure, at the high level a CTU 100 can be partitioned into BUs according to one of the QTBT structures described above, or according to a quadtree (QT) structure such as the one used in HEVC in which blocks can only be split into four equally sized sub-blocks. By way of a non-limiting example, a CTU 100 can be partitioned into BUs according to the QTBT structure described above with respect to FIGS. 5-6, such that leaf nodes in the quadtree portion can be split using quadtree partitioning, symmetric binary partitioning, or asymmetric binary partitioning. In this example, the final leaf nodes of the QTBT can be BUs instead of CUs.

At the lower level in the two-level coding block structure, each BU partitioned from the CTU 100 can be further partitioned into one or more OUs. In some embodiments, when the BU is square, it can be split into OUs using quadtree partitioning or binary partitioning, such as symmetric or asymmetric binary partitioning. However, when the BU is not square, it can be split into OUs using binary partitioning only. Limiting the type of partitioning that can be used for non-square BUs can limit the number of bits used to signal the type of partitioning used to generate BUs.

Although the discussion below describes coding CUs 102, BUs and OUs can be coded instead of CUs 102 in embodiments that use the two-level coding block structure. By way of a non-limiting example, BUs can be used for higher level coding operations such as intra prediction or inter prediction, while the smaller OUs can be used for lower level coding operations such as transforms and generating transform coefficients. Accordingly, syntax can be coded for BUs that indicates whether they are coded with intra prediction or inter prediction, or information identifying particular intra prediction modes or motion vectors used to code the BUs. Similarly, syntax for OUs can identify particular transform operations or quantized transform coefficients used to code the OUs.

FIG. 7A depicts a simplified block diagram for CU coding in a JVET encoder. The main stages of video coding include partitioning to identify CUs 102 as described above, followed by encoding CUs 102 using prediction at 704 or 706, generation of a residual CU 710 at 708, transformation at 712, quantization at 716, and entropy coding at 720. The encoder and encoding process illustrated in FIG. 7A also includes a decoding process that is described in more detail below.

Given a current CU 102, the encoder can obtain a prediction CU 702 either spatially using intra prediction at 704 or temporally using inter prediction at 706. The basic idea of prediction coding is to transmit a differential, or residual, signal between the original signal and a prediction for the original signal. At the receiver side, the original signal can be reconstructed by adding the residual and the prediction, as will be described below. Because the differential signal has a lower correlation than the original signal, fewer bits are needed for its transmission.
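The residual principle can be shown with a minimal sketch. The helper names are hypothetical, and the sketch deliberately omits the transformation and quantization stages, so the round trip here is exact, whereas a real codec that quantizes the residual would generally reconstruct only an approximation.

```python
def compute_residual(original, prediction):
    """Residual block: element-wise difference between original and prediction."""
    return [[o - p for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, prediction)]


def reconstruct(prediction, residual):
    """Receiver side: add the residual back onto the same prediction."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, residual)]


original = [[52, 55], [61, 59]]
prediction = [[50, 54], [60, 60]]
residual = compute_residual(original, prediction)
restored = reconstruct(prediction, residual)
```

With a good prediction the residual values are small in magnitude, which is what makes them cheaper to code than the original samples.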

A slice, such as an entire picture or a portion of a picture, coded entirely with intra-predicted CUs 102 can be an I slice that can be decoded without reference to other slices, and as such can be a possible point where decoding can begin. A slice coded with at least some inter-predicted CUs can be a predictive (P) or bi-predictive (B) slice that can be decoded based on one or more reference pictures. P slices may use intra-prediction and inter-prediction with previously coded slices. For example, P slices may be compressed further than the I-slices by the use of inter-prediction, but need the coding of a previously coded slice to code them. B slices can use data from previous and/or subsequent slices for their coding, using intra-prediction or inter-prediction using an interpolated prediction from two different frames, thus increasing the accuracy of the motion estimation process. In some cases P slices and B slices can also or alternately be encoded using intra block copy, in which data from other portions of the same slice is used.

As will be discussed below, intra prediction or inter prediction can be performed based on reconstructed CUs 734 from previously coded CUs 102, such as neighboring CUs 102 or CUs 102 in reference pictures.

When a CU 102 is coded spatially with intra prediction at 704, an intra prediction mode can be found that best predicts pixel values of the CU 102 based on samples from neighboring CUs 102 in the picture.

When coding a CU's luma component, the encoder can generate a list of candidate intra prediction modes. While HEVC had 35 possible intra prediction modes for luma components, in JVET there are 67 possible intra prediction modes for luma components. These include a planar mode that uses a three dimensional plane of values generated from neighboring pixels, a DC mode that uses values averaged from neighboring pixels, and the 65 directional modes shown in FIG. 8 that use values copied from neighboring pixels along the indicated directions.
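As a non-limiting illustration of the simplest of these modes, the DC mode can be sketched as filling the block with a rounded average of the neighboring samples. The function name, the choice of which neighbors are averaged (the row above and the column to the left), and the rounding are assumptions made for this sketch and are not taken from the disclosure.

```python
def dc_predict(top, left, width, height):
    """Fill a width x height block with the rounded mean of its neighbors.

    top:  reconstructed samples directly above the block (at least `width`).
    left: reconstructed samples directly left of the block (at least `height`).
    """
    neighbors = list(top[:width]) + list(left[:height])
    count = len(neighbors)
    # Integer average with round-half-up, an assumed rounding convention.
    dc = (sum(neighbors) + count // 2) // count
    return [[dc] * width for _ in range(height)]
```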

When generating a list of candidate intra prediction modes for a CU's luma component, the number of candidate modes on the list can depend on the CU's size. The candidate list can include: a subset of HEVC's 35 modes with the lowest SATD (Sum of Absolute Transform Difference) costs; new directional modes added for JVET that neighbor the candidates found from the HEVC modes; and modes from a set of six most probable modes (MPMs) for the CU 102 that are identified based on intra prediction modes used for previously coded neighboring blocks as well as a list of default modes.

When coding a CU's chroma components, a list of candidate intra prediction modes can also be generated. The list of candidate modes can include modes generated with cross-component linear model projection from luma samples, intra prediction modes found for luma CBs in particular collocated positions in the chroma block, and chroma prediction modes previously found for neighboring blocks. The encoder can find the candidate modes on the lists with the lowest rate distortion costs, and use those intra prediction modes when coding the CU's luma and chroma components. Syntax can be coded in the bitstream that indicates the intra prediction modes used to code each CU 102.

After the best intra prediction modes for a CU 102 have been selected, the encoder can generate a prediction CU 702 using those modes. When the selected modes are directional modes, a 4-tap filter can be used to improve the directional accuracy. Columns or rows at the top or left side of the prediction block can be adjusted with boundary prediction filters, such as 2-tap or 3-tap filters.

The prediction CU 702 can be smoothed further with a position dependent intra prediction combination (PDPC) process that adjusts a prediction CU 702 generated based on filtered samples of neighboring blocks using unfiltered samples of neighboring blocks, or adaptive reference sample smoothing using 3-tap or 5-tap low pass filters to process reference samples in step 705 b. In some embodiments, PDPC can be accomplished in accordance with the following Equation (1):

P′[x,y]=((A*Recon[x,−1]−B*Recon[−1,−1]+C*Recon[−1,y]+D*P[x,y]+Round)/Denom)  Equation (1)

where A=(Cv1>>int(y/dy)), B=((Cv2>>int(y/dy))+(Ch2>>int(x/dx))), C=(Ch1>>int(x/dx)), and D=(1<<Denom)−A−C+B, such that P′[x,y] is a filtered pixel after the post-filtering operation at coordinate (x,y) of the current CU. Cv1, Cv2, Ch1, and Ch2 are PDPC parameters determining the filtering effect, ‘Round’ is a rounding parameter, and ‘Denom’ is a normalization factor.

In some embodiments, weighted angular prediction can be employed, which generates predictor pixels for angular prediction using pixels at projected positions on both a top reference row and a left reference column. In embodiments employing weighted angular prediction, the prediction generation can be done in three steps: main reference projected prediction, side reference projected prediction, and combination of the projected predictions.

In some embodiments employing weighted angular prediction, the system and method can project a pixel position along a main reference according to an angular direction definition of the coding intra prediction mode and determine a pixel value of the projected position using linear interpolation between two neighboring reconstructed pixels. The system and method can also project a pixel position along a side reference according to the angular definition of the same coding mode and determine a pixel value of the projected position using linear interpolation between two neighboring reconstructed pixels. Then the system and method can combine the projected pixel value of the main reference with the projected pixel value of the side reference. A non-limiting exemplary combination is shown below in Equation (2). In the exemplary combination shown in Equation (2) the values are weighted according to the distances between the predictor pixels and projected pixel positions on the main and side references. However, in alternate embodiments alternate values can be used to weight the values associated with the main and side reference pixels.

P[x,y]=(((w1*MainRecon[x′,y′])+(w2*SideRecon[x″,y″])+(w1+w2)/2)/(w1+w2))  Equation (2)

In exemplary Equation (2) above, MainRecon[x′,y′] is a pixel value of a neighbor at projected position (x′,y′), corresponding to the predicting pixel (x,y), along the main reference. SideRecon[x″,y″] is a pixel value of a neighbor at projected position (x″,y″), corresponding to the predicting pixel (x,y), along the side reference.

Equation (3) below shows a non-limiting exemplary combination using weighted angular prediction using HEVC mode 2 or mode 66, and a predictor pixel at coordinate (x,y). Accordingly, P[x,y] would be determined as shown and described in Equation (3), in which Recon[0,0] is a reconstructed pixel at top left coordinate (0,0) of the current CU.
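The combination in Equation (2) can be sketched for a single predictor pixel as follows. The function name is hypothetical, and integer (floor) division is an assumption of the sketch; the added (w1+w2)/2 term then acts as a rounding offset before the division by the total weight.

```python
def weighted_angular_combine(main_value, side_value, w1, w2):
    """Combine projected main and side reference values per Equation (2).

    main_value: MainRecon[x', y'], the projected value on the main reference.
    side_value: SideRecon[x'', y''], the projected value on the side reference.
    w1, w2:     positive integer weights for the main and side values.
    """
    total = w1 + w2
    return (w1 * main_value + w2 * side_value + total // 2) // total
```

With equal weights the result is the rounded average of the two projected values; with unequal weights the prediction is pulled toward the more heavily weighted reference.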

P[x,y]=((((x+1)*Recon[x+y+2,−1])+((y+1)*(Recon[−1,x+y+2]))+(y+x+2)/2)/(y+x+2))  Equation (3)
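By way of a non-limiting illustration, Equation (3) can be evaluated over a whole block as sketched below. The function name and the list-based reference layout are assumptions of the sketch: top[i] stands for Recon[i,−1] and left[j] stands for Recon[−1,j]. Integer division is assumed, and every referenced position is assumed to be available, so the exception handling for unavailable positions is not modeled.

```python
def predict_mode_2_or_66(top, left, width, height):
    """Weighted angular prediction for the shared diagonal of modes 2 and 66.

    top[i]  stands for Recon[i, -1]  (reference row above the block).
    left[j] stands for Recon[-1, j]  (reference column left of the block).
    Both lists need at least width + height + 1 entries.
    Returns the prediction as rows indexed pred[y][x].
    """
    needed = width + height + 1
    if len(top) < needed or len(left) < needed:
        raise ValueError("reference arrays are too short for this block")
    pred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            k = x + y + 2      # projected index on both references
            total = x + y + 2  # sum of the two weights, (x+1) + (y+1)
            pred[y][x] = ((x + 1) * top[k] + (y + 1) * left[k]
                          + total // 2) // total
    return pred
```

Because the same expression serves both modes, a single routine of this kind can produce the prediction whichever of the two modes is signaled.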

An exception to the system and process in which weighted angular prediction might not be employed can occur when a projected reference position on the side reference refers to a reconstructed position that is not a viable position or is not available. In such instances when weighted angular prediction may not be employed, multiple options are possible to handle the exception. In some embodiments, the exception can be handled by using the value of the last available reconstructed pixel or a default value for a projected position. In other alternate embodiments, the exception can be handled by disabling weighted angular prediction and/or using a projected pixel position of the main reference only. Thus, in step 705 a, it can be determined whether weighted angular prediction has been employed as the intra prediction mode in step 704. If in step 705 a, the intra prediction mode is determined as using weighted angular prediction, then the prediction coding unit 702 can be delivered for entropy coding absent filtering. However, if in step 705 a, the intra prediction mode is determined to be other than weighted angular prediction, post intra prediction filtering 705 b, such as PDPC filtering, can be applied to the prediction coding unit prior to delivery for entropy coding.

As depicted in FIG. 7B, in some embodiments, a post intra prediction filter 705 b can be employed after step 704 for all intra predictions. In such embodiments depicted in FIG. 7B, if the intra prediction mode is based upon other than weighted angular prediction, then the filter can be applied as it would normally be applied in step 705 b. However, if the intra prediction mode is based upon weighted angular prediction, filtering in step 705 b can be bypassed and/or in some embodiments, the filter applied can be unbiased toward the main reference, side reference or main and side references. By way of non-limiting example, the values of Cv1 and Ch1 can be equal and/or the values of Cv2 and Ch2 can be equal.

When a CU 102 is coded temporally with inter prediction at 706, a set of motion vectors (MVs) can be found that points to samples in reference pictures that best predict pixel values of the CU 102. Inter prediction exploits temporal redundancy between slices by representing a displacement of a block of pixels in a slice. The displacement is determined according to the value of pixels in previous or following slices through a process called motion compensation. Motion vectors and associated reference indices that indicate pixel displacement relative to a particular reference picture can be provided in the bitstream to a decoder, along with the residual between the original pixels and the motion compensated pixels. The decoder can use the residual and signaled motion vectors and reference indices to reconstruct a block of pixels in a reconstructed slice.

In JVET, motion vector accuracy can be stored at 1/16 pel, and the difference between a motion vector and a CU's predicted motion vector can be coded with either quarter-pel resolution or integer-pel resolution.
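The relationship between the three precisions can be sketched as follows. The constant and function names are hypothetical; the sketch assumes a motion vector component is held as an integer count of 1/16-pel units, so that a quarter-pel difference is a multiple of 4 units and an integer-pel difference is a multiple of 16 units.

```python
UNITS_PER_PEL = 16  # storage precision of 1/16 pel

# Step size, in 1/16-pel units, for each motion vector difference resolution.
MVD_STEP = {"quarter": UNITS_PER_PEL // 4, "integer": UNITS_PER_PEL}


def to_pel(mv_units):
    """Convert a stored motion vector component to a displacement in pels."""
    return mv_units / UNITS_PER_PEL


def is_representable(mvd_units, resolution):
    """True if a difference in 1/16-pel units fits the chosen resolution."""
    return mvd_units % MVD_STEP[resolution] == 0
```

For example, a difference of 20 units is 1.25 pel, which fits quarter-pel resolution but not integer-pel resolution in this sketch.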

In JVET motion vectors can be found for multiple sub-CUs within a CU102, using techniques such as advanced temporal motion vector prediction(ATMVP), spatial-temporal motion vector prediction (STMVP), affinemotion compensation prediction, pattern matched motion vector derivation(PMMVD), and/or bi-directional optical flow (BIO).

Using ATMVP, the encoder can find a temporal vector for the CU 102 thatpoints to a corresponding block in a reference picture. The temporalvector can be found based on motion vectors and reference pictures foundfor previously coded neighboring CUs 102. Using the reference blockpointed to by a temporal vector for the entire CU 102, a motion vectorcan be found for each sub-CU within the CU 102.

STMVP can find motion vectors for sub-CUs by scaling and averagingmotion vectors found for neighboring blocks previously coded with interprediction, together with a temporal vector.

Affine motion compensation prediction can be used to predict a field ofmotion vectors for each sub-CU in a block, based on two control motionvectors found for the top corners of the block. For example, motionvectors for sub-CUs can be derived based on top corner motion vectorsfound for each 4×4 block within the CU 102.

PMMVD can find an initial motion vector for the current CU 102 usingbilateral matching or template matching. Bilateral matching can look atthe current CU 102 and reference blocks in two different referencepictures along a motion trajectory, while template matching can look atcorresponding blocks in the current CU 102 and a reference pictureidentified by a template. The initial motion vector found for the CU 102can then be refined individually for each sub-CU.

BIO can be used when inter prediction is performed with bi-predictionbased on earlier and later reference pictures, and allows motion vectorsto be found for sub-CUs based on the gradient of the difference betweenthe two reference pictures.

In some situations local illumination compensation (LIC) can be used atthe CU level to find values for a scaling factor parameter and an offsetparameter, based on samples neighboring the current CU 102 andcorresponding samples neighboring a reference block identified by acandidate motion vector. In JVET, the LIC parameters can change and besignaled at the CU level.

For some of the above methods the motion vectors found for each of a CU's sub-CUs can be signaled to decoders at the CU level. For other methods, such as PMMVD and BIO, motion information is not signaled in the bitstream to save overhead, and decoders can derive the motion vectors through the same processes.

After the motion vectors for a CU 102 have been found, the encoder can generate a prediction CU 702 using those motion vectors. In some cases, when motion vectors have been found for individual sub-CUs, Overlapped Block Motion Compensation (OBMC) can be used when generating a prediction CU 702 by combining those motion vectors with motion vectors previously found for one or more neighboring sub-CUs.

When bi-prediction is used, JVET can use decoder-side motion vector refinement (DMVR) to find motion vectors. DMVR allows a motion vector to be found based on two motion vectors found for bi-prediction using a bilateral template matching process. In DMVR, a weighted combination of prediction CUs 702 generated with each of the two motion vectors can be found, and the two motion vectors can be refined by replacing them with new motion vectors that best point to the combined prediction CU 702. The two refined motion vectors can be used to generate the final prediction CU 702.

At 708, once a prediction CU 702 has been found with intra prediction at 704 or inter prediction at 706 as described above, the encoder can subtract the prediction CU 702 from the current CU 102 to find a residual CU 710.

The encoder can use one or more transform operations at 712 to convert the residual CU 710 into transform coefficients 714 that express the residual CU 710 in a transform domain, such as using a discrete cosine block transform (DCT-transform) to convert data into the transform domain. JVET allows more types of transform operations than HEVC, including DCT-II, DST-VII, DCT-VIII, DST-I, and DCT-V operations. The allowed transform operations can be grouped into sub-sets, and an indication of which sub-sets and which specific operations in those sub-sets were used can be signaled by the encoder. In some cases, large block-size transforms can be used to zero out high frequency transform coefficients in CUs 102 larger than a certain size, such that only lower-frequency transform coefficients are maintained for those CUs 102.

In some cases a mode dependent non-separable secondary transform (MDNSST) can be applied to low frequency transform coefficients 714 after a forward core transform. The MDNSST operation can use a Hypercube-Givens Transform (HyGT) based on rotation data. When used, an index value identifying a particular MDNSST operation can be signaled by the encoder.

At 716, the encoder can quantize the transform coefficients 714 into quantized transform coefficients 718. The quantization of each coefficient may be computed by dividing a value of the coefficient by a quantization step (Qstep), which is derived from a quantization parameter (QP). In some embodiments, the Qstep is defined as 2^((QP−4)/6). Because high precision transform coefficients 714 can be converted into quantized transform coefficients 718 with a finite number of possible values, quantization can assist with data compression. Thus, quantization of the transform coefficients may limit an amount of bits generated and sent by the transformation process. However, quantization is a lossy operation, and the loss by quantization cannot be recovered, so the quantization process presents a trade-off between quality of the reconstructed sequence and an amount of information needed to represent the sequence. For example, a lower QP value can result in better quality decoded video, although a higher amount of data may be required for representation and transmission. In contrast, a high QP value can result in lower quality reconstructed video sequences but with lower data and bandwidth needs.
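By way of non-limiting illustration, the relationship between QP, the quantization step and the loss introduced by quantization can be sketched as follows. The step formula is the one given above; the function names and the round-to-nearest rule are assumptions, since the text does not specify a rounding rule.

```python
def qstep(qp: int) -> float:
    """Quantization step from the formula in the text: 2^((QP - 4) / 6)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, qp: int) -> int:
    """Divide by the step and round to the nearest level (assumed rounding)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / qstep(qp) + 0.5)


def dequantize(level: int, qp: int) -> float:
    """Multiply the level by the step; the rounding loss is not recovered."""
    return level * qstep(qp)
```

With this formula the step doubles for every increase of 6 in QP. For example, a coefficient of 100 quantized at QP 22 (step 8) becomes level 13, which dequantizes to 104 rather than 100.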

JVET can utilize variance-based adaptive quantization techniques, which allow every CU 102 to use a different quantization parameter for its coding process (instead of using the same frame QP in the coding of every CU 102 of the frame). The variance-based adaptive quantization techniques adaptively lower the quantization parameter of certain blocks while increasing it in others. To select a specific QP for a CU 102, the CU's variance is computed. In brief, if a CU's variance is higher than the average variance of the frame, a higher QP than the frame's QP may be set for the CU 102. If the CU 102 presents a lower variance than the average variance of the frame, a lower QP may be assigned.
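By way of non-limiting illustration, the selection rule described above can be sketched as follows. The fixed adjustment of 2, the clamping range of 0 to 51 and the function name select_cu_qp are assumptions; the text states only the direction of the adjustment, not its size.

```python
from statistics import pvariance


def select_cu_qp(cu_pixels, frame_avg_variance, frame_qp,
                 delta=2, qp_min=0, qp_max=51):
    """Raise QP for CUs with above-average variance, lower it otherwise.

    delta, qp_min and qp_max are illustrative assumptions.
    """
    variance = pvariance(cu_pixels)
    if variance > frame_avg_variance:
        qp = frame_qp + delta
    elif variance < frame_avg_variance:
        qp = frame_qp - delta
    else:
        qp = frame_qp
    return max(qp_min, min(qp_max, qp))
```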

At 720, the encoder can find final compression bits 722 by entropy coding the quantized transform coefficients 718. Entropy coding aims to remove statistical redundancies of the information to be transmitted. In JVET, CABAC (Context Adaptive Binary Arithmetic Coding) can be used to code the quantized transform coefficients 718, which uses probability measures to remove the statistical redundancies. For CUs 102 with non-zero quantized transform coefficients 718, the quantized transform coefficients 718 can be converted into binary. Each bit (“bin”) of the binary representation can then be encoded using a context model. A CU 102 can be broken up into three regions, each with its own set of context models to use for pixels within that region.

Multiple scan passes can be performed to encode the bins. During passes to encode the first three bins (bin0, bin1, and bin2), an index value that indicates which context model to use for the bin can be found by finding the sum of that bin position in up to five previously coded neighboring quantized transform coefficients 718 identified by a template.

A context model can be based on probabilities of a bin's value being ‘0’ or ‘1’. As values are coded, the probabilities in the context model can be updated based on the actual number of ‘0’ and ‘1’ values encountered. While HEVC used fixed tables to re-initialize context models for each new picture, in JVET the probabilities of context models for new inter-predicted pictures can be initialized based on context models developed for previously coded inter-predicted pictures.
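By way of non-limiting illustration, a context model that tracks the number of ‘0’ and ‘1’ values encountered, and that can be carried over to a new picture, can be sketched as follows. This counting model is a deliberate simplification and an assumption; it is not the finite-precision probability estimator of an actual CABAC engine, and the class and method names are hypothetical.

```python
class CountingContextModel:
    """Toy adaptive context model based on counts of observed bin values."""

    def __init__(self, zeros: int = 1, ones: int = 1) -> None:
        # Starting both counts at 1 gives an initial probability of 0.5.
        self.zeros = zeros
        self.ones = ones

    def probability_of_one(self) -> float:
        return self.ones / (self.zeros + self.ones)

    def update(self, bin_value: int) -> None:
        """Adapt the model after coding one bin."""
        if bin_value == 1:
            self.ones += 1
        else:
            self.zeros += 1

    def inherit(self) -> "CountingContextModel":
        """Copy the learned state to initialize a model for a new picture."""
        return CountingContextModel(self.zeros, self.ones)
```

The inherit method mirrors the idea of initializing a new inter-predicted picture from previously developed models instead of from a fixed table.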

The encoder can produce a bitstream that contains entropy encoded bits 722 of residual CUs 710, prediction information such as selected intra prediction modes or motion vectors, indicators of how the CUs 102 were partitioned from a CTU 100 according to the QTBT structure, and/or other information about the encoded video. The bitstream can be decoded by a decoder as discussed below.

In addition to using the quantized transform coefficients 718 to find the final compression bits 722, the encoder can also use the quantized transform coefficients 718 to generate reconstructed CUs 734 by following the same decoding process that a decoder would use to generate reconstructed CUs 734. Thus, once the transformation coefficients have been computed and quantized by the encoder, the quantized transform coefficients 718 may be transmitted to the decoding loop in the encoder. After quantization of a CU's transform coefficients, a decoding loop allows the encoder to generate a reconstructed CU 734 identical to the one the decoder generates in the decoding process. Accordingly, the encoder can use the same reconstructed CUs 734 that a decoder would use for neighboring CUs 102 or reference pictures when performing intra prediction or inter prediction for a new CU 102. Reconstructed CUs 102, reconstructed slices, or full reconstructed frames may serve as references for further prediction stages.

At the encoder's decoding loop (and see below, for the same operations in the decoder) to obtain pixel values for the reconstructed image, a dequantization process may be performed. To dequantize a frame, for example, a quantized value for each pixel of a frame is multiplied by the quantization step, e.g., (Qstep) described above, to obtain reconstructed dequantized transform coefficients 726. For example, in the decoding process shown in FIG. 7A in the encoder, the quantized transform coefficients 718 of a residual CU 710 can be dequantized at 724 to find dequantized transform coefficients 726. If an MDNSST operation was performed during encoding, that operation can be reversed after dequantization.

At 728, the dequantized transform coefficients 726 can be inverse transformed to find a reconstructed residual CU 730, such as by applying an inverse DCT to the values to obtain the reconstructed image. At 732 the reconstructed residual CU 730 can be added to a corresponding prediction CU 702 found with intra prediction at 704 or inter prediction at 706, in order to find a reconstructed CU 734.

At 736, one or more filters can be applied to the reconstructed data during the decoding process (in the encoder or, as described below, in the decoder), at either a picture level or CU level. For example, the encoder can apply a deblocking filter, a sample adaptive offset (SAO) filter, and/or an adaptive loop filter (ALF). The encoder's decoding process may implement filters to estimate and transmit to a decoder the optimal filter parameters that can address potential artifacts in the reconstructed image. Such improvements increase the objective and subjective quality of the reconstructed video. In deblocking filtering, pixels near a sub-CU boundary may be modified, whereas in SAO, pixels in a CTU 100 may be modified using either an edge offset or band offset classification. JVET's ALF can use filters with circularly symmetric shapes for each 2×2 block. An indication of the size and identity of the filter used for each 2×2 block can be signaled. Alternately, in some embodiments in which weighted angular prediction is implemented for the prediction CU, alternate or no filters can be applied to the reconstructed CU.

If reconstructed pictures are reference pictures, they can be stored in a reference buffer 738 for inter prediction of future CUs 102 at 706.

During the above steps, JVET allows content adaptive clipping operations to be used to adjust color values to fit between lower and upper clipping bounds. The clipping bounds can change for each slice, and parameters identifying the bounds can be signaled in the bitstream.

FIG. 9 depicts a simplified block diagram for CU coding in a JVET decoder. A JVET decoder can receive a bitstream containing information about encoded CUs 102. The bitstream can indicate how CUs 102 of a picture were partitioned from a CTU 100 according to a QTBT structure. By way of a non-limiting example, the bitstream can identify how CUs 102 were partitioned from each CTU 100 in a QTBT using quadtree partitioning, symmetric binary partitioning, and/or asymmetric binary partitioning. The bitstream can also indicate prediction information for the CUs 102 such as intra prediction modes or motion vectors, and bits 902 representing entropy encoded residual CUs.

At 904 the decoder can decode the entropy encoded bits 902 using the CABAC context models signaled in the bitstream by the encoder. The decoder can use parameters signaled by the encoder to update the context models' probabilities in the same way they were updated during encoding.

After reversing the entropy encoding at 904 to find quantized transform coefficients 906, the decoder can dequantize them at 908 to find dequantized transform coefficients 910. If an MDNSST operation was performed during encoding, that operation can be reversed by the decoder after dequantization.

At 912, the dequantized transform coefficients 910 can be inverse transformed to find a reconstructed residual CU 914. At 916, the reconstructed residual CU 914 can be added to a corresponding prediction CU 926 found with intra prediction at 922 or inter prediction at 924, in order to find a reconstructed CU 918.

Thus, in step 923 a, it can be determined whether weighted angular prediction has been employed as the intra prediction mode in step 922. If in step 923 a, the intra prediction mode is determined as using weighted angular prediction, then the prediction coding unit 926 can be delivered for entropy coding absent filtering. However, if in step 923 a, the intra prediction mode is determined to be other than weighted angular prediction, post intra prediction filtering 923 b, such as PDPC filtering, can be applied to the prediction coding unit prior to delivery for entropy coding.

At 920, one or more filters can be applied to the reconstructed data, at either a picture level or CU level. For example, the decoder can apply a deblocking filter, a sample adaptive offset (SAO) filter, and/or an adaptive loop filter (ALF). As described above, the in-loop filters located in the decoding loop of the encoder may be used to estimate optimal filter parameters to increase the objective and subjective quality of a frame. These parameters are transmitted to the decoder to filter the reconstructed frame at 920 to match the filtered reconstructed frame in the encoder.

After reconstructed pictures have been generated by finding reconstructed CUs 918 and applying signaled filters, the decoder can output the reconstructed pictures as output video 928. If reconstructed pictures are to be used as reference pictures, they can be stored in a reference buffer 930 for inter prediction of future CUs 102 at 924.

FIG. 10 depicts an embodiment of a method of CU coding 1000 in a JVET decoder. In the embodiment depicted in FIG. 10, in step 1002 an encoded bitstream 902 can be received and then in step 1004 the CABAC context model associated with the encoded bitstream 902 can be determined and the encoded bitstream 902 can then be decoded using the determined CABAC context model in step 1006.

In step 1008, the quantized transform coefficients 906 associated with the encoded bitstream 902 can be determined and de-quantized transform coefficients 910 can then be determined from the quantized transform coefficients 906 in step 1010.

In step 1012, it can be determined whether an MDNSST operation was performed during encoding and/or if the bitstream 902 contains indications that an MDNSST operation was applied to the bitstream 902. If it is determined that an MDNSST operation was performed during the encoding process or the bitstream 902 contains indications that an MDNSST operation was applied to the bitstream 902, then an inverse MDNSST operation 1014 can be implemented before an inverse transform operation 912 is performed on the bitstream 902 in step 1016. Alternately, an inverse transform operation 912 can be performed on the bitstream 902 in step 1016 absent application of an inverse MDNSST operation in step 1014. The inverse transform operation 912 in step 1016 can determine and/or construct a reconstructed residual CU 914.

In step 1018, the reconstructed residual CU 914 from step 1016 can be combined with a prediction CU 918. The prediction CU 918 can be one of an intra-prediction CU 922 determined in step 1020 and an inter-prediction unit 924 determined in step 1022.

Thus, in step 1023 a, it can be determined whether weighted angular prediction has been employed as the intra prediction mode in step 1020. If in step 1023 a, the intra prediction mode is determined as using weighted angular prediction, then the prediction coding unit 926 can be delivered for entropy coding absent filtering and/or filtering performed in step 1024 can be modified and/or absent. However, if in step 1023 a, the intra prediction mode is determined to be other than weighted angular prediction, post intra prediction filtering 1023 b and/or at step 1024, such as PDPC filtering, can be applied to the prediction coding unit prior to delivery for entropy coding.

As depicted in FIG. 10, in some embodiments step 1023 b can be absent and a post intra prediction filter 1024 can be employed after step 1018 for all predictions. In such embodiments depicted in FIG. 10, if the intra prediction mode is based upon other than weighted angular prediction, then the filter can be applied as it would normally be applied in step 1024. However, if the intra prediction mode is based upon weighted angular prediction, filtering in step 1024 can be bypassed and/or, in some embodiments, the filter applied can be unbiased toward the main reference, side reference or main and side references prior to output of the reconstructed CU in step 1026. By way of non-limiting example, the values of Cv1 and Ch1 can be equal and/or the values of Cv2 and Ch2 can be equal.

In step 1024, any one or more filters 920 can be applied to the reconstructed CU 914 and output in step 1026. In some embodiments filters 920 may not be applied in step 1024.

In some embodiments, in step 1028, the reconstructed CU 918 can be stored in a reference buffer 930.

FIG. 11 depicts a simplified block diagram 1100 for CU coding in a JVET encoder. In step 1102 a JVET coding tree unit can be represented as a root node in a quadtree plus binary tree (QTBT) structure. In some embodiments the QTBT can have a quadtree branching from the root node and/or binary trees branching from one or more of the quadtree's leaf nodes. The representation from step 1102 can proceed to step 1104, 1106 or 1108.

In step 1104, asymmetric binary partitioning can be employed to split a represented quadtree node into two blocks of unequal size. In some embodiments, the split blocks can be represented in a binary tree branching from the quadtree node as leaf nodes that can represent final coding units. In some embodiments, the leaf nodes of the binary tree branching from the quadtree node represent final coding units in which further splitting is disallowed. In some embodiments the asymmetric partitioning can split a coding unit into blocks of unequal size, a first representing 25% of the quadtree node and a second representing 75% of the quadtree node.

In step 1106, quadtree partitioning can be employed to split a represented quadtree node into four square blocks of equal size. In some embodiments the split blocks can be represented as quadtree nodes that represent final coding units or can be represented as child nodes that can be split again with quadtree partitioning, symmetric binary partitioning, or asymmetric binary partitioning.

In step 1108, symmetric binary partitioning can be employed to split a represented quadtree node into two blocks of equal size. In some embodiments the split blocks can be represented as quadtree nodes that represent final coding units or can be represented as child nodes that can be split again with quadtree partitioning, symmetric binary partitioning, or asymmetric binary partitioning.
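By way of non-limiting illustration, the block sizes produced by the three kinds of split described for steps 1104 through 1108 (an unequal two-way split, a four-way split and an equal two-way split) can be sketched as follows. The function names, the vertical flag (a vertical split line that divides the width) and the small_first flag are hypothetical names introduced for this sketch, and block dimensions are assumed to be divisible by four.

```python
def quadtree_split(width, height):
    """Four equal blocks, each half the width and half the height."""
    return [(width // 2, height // 2)] * 4


def symmetric_binary_split(width, height, vertical):
    """Two equal blocks; vertical=True divides the width."""
    if vertical:
        return [(width // 2, height), (width // 2, height)]
    return [(width, height // 2), (width, height // 2)]


def asymmetric_binary_split(width, height, vertical, small_first=True):
    """Two unequal blocks covering 25% and 75% of the parent."""
    if vertical:
        small, large = (width // 4, height), (width - width // 4, height)
    else:
        small, large = (width, height // 4), (width, height - height // 4)
    return [small, large] if small_first else [large, small]
```

For a 32×16 parent, a vertical asymmetric split yields an 8×16 block and a 24×16 block.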

In step 1110, child nodes from step 1106 or step 1108 can be represented as child nodes configured to be encoded. In some embodiments the child nodes can be represented by leaf nodes of the binary tree with JVET.

In step 1112, coding units from step 1104 or 1110 can be encoded using JVET.

FIG. 12 depicts a simplified block diagram 1200 for CU decoding in a JVET decoder. In the embodiment depicted in FIG. 12, in step 1202 a bitstream indicating how a coding tree unit was partitioned into coding units according to a QTBT structure can be received. The bitstream can indicate how quadtree nodes are split with at least one of quadtree partitioning, symmetric binary partitioning or asymmetric binary partitioning.

In step 1204, coding units, represented by leaf nodes of the QTBT structure, can be identified. In some embodiments, the coding units can indicate whether a node was split from a quadtree leaf node using asymmetric binary partitioning. In some embodiments, the coding unit can indicate that the node represents a final coding unit to be decoded.

In step 1206, the identified coding unit(s) can be decoded using JVET.

FIG. 13 depicts a simplified block diagram 1300 of an increased efficiency coding system and method. In coding and decoding systems, a predictor is generated in intra coding to exploit the correlation between the coding block and its neighbors. In JVET, a reference row adjacent to the top boundary and a reference column adjacent to the left boundary of the coding block are used in the predictor generation process. For each intra prediction mode, a projected neighbor position along a reference line for each pixel within a PU is determined using the angular direction associated with the determined intra mode. Projected neighbors along a reference column serve as a main reference line for horizontal modes (modes 2-33) and projected neighbors along a reference row serve as a main reference line for vertical modes (modes 35-66). The reference column or row that is partially used in predictor generation is called the side reference line. As shown in FIG. 8, intra prediction modes 2 and 66 share the same prediction angle. However, mode 2 uses the left neighbor as a reference, while mode 66 uses the top neighbor as the reference. Thus, improved coding efficiency can be achieved by combining these two modes (2 and 66) together so that one codeword is used to signal these two modes, resulting in a reduction of overhead bits.

In step 1302 a coding prediction mode is determined, then in step 1304 a determination is made as to whether the coding mode is mode 2 or mode 66. If the determined coding prediction mode is other than mode 2 or mode 66, then any known, convenient and/or desired coding prediction technique can be employed. However, if coding prediction mode 2 or 66 is determined, then a modified and more efficient prediction coding can be employed.

Disclosed is an intra prediction mode that combines two intra prediction modes, 2 and 66, using one coding mode. The method 1300 maintains prediction accuracy of the two intra prediction modes, 2 and 66, while not significantly increasing the burden in choosing the prediction direction at both encoder and decoder. Accordingly, the new mode is able to adaptively set its predictor to follow the predictor of one mode, instead of another, when its prediction direction provides a more accurate predictor, and vice versa. In some embodiments, one heuristic approach is to use available coding information at the decoder side to choose between the two modes (2 and 66). Various information can be used to determine a prediction direction for the new combined mode. In some embodiments, a block dimension, such as width or height, can be used as a selection criterion. In such embodiments, the prediction direction can be chosen such that it follows the direction that has the longer boundary. However, in alternate embodiments the prediction direction that has the shorter boundary can be selected.

By way of non-limiting example, using block dimension as a selection criterion and prediction modes 2 and 66, a predictor pixel of weighted angular prediction at coordinate (x,y), P(x,y), can be calculated as:

P[x,y]=Recon[x+y+2,−1], when width>height; or

P[x,y]=Recon[−1,x+y+2], for alternate conditions

Where Recon[0,0] is a reconstructed pixel at top left coordinate (0,0) of the current CU.
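By way of non-limiting illustration, the block-dimension rule above can be sketched as follows. The sketch assumes that the top reference row is supplied as a list top_ref with top_ref[i] = Recon[i,−1] and the left reference column as a list left_ref with left_ref[j] = Recon[−1,j], each holding at least width + height + 1 samples; these names and the list layout are assumptions made for illustration.

```python
def predict_combined_mode(top_ref, left_ref, width, height):
    """Predictor for the combined mode 2 / mode 66.

    Follows the top reference row when width > height and the left
    reference column otherwise, as in the two equations above.
    Returns a list of rows, so the pixel at (x, y) is pred[y][x].
    """
    use_top = width > height
    pred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            idx = x + y + 2  # projected position along the reference line
            pred[y][x] = top_ref[idx] if use_top else left_ref[idx]
    return pred
```

Because the direction is derived from the block shape, a decoder can repeat the same choice without additional signaling.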

By way of alternate, non-limiting example, a pixel difference (e.g., variance) along the reference row and pixel difference along the reference column can be used. In such embodiments, a prediction direction can be made to follow that direction having the smaller (or larger) pixel difference.
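By way of non-limiting illustration, a selection based on pixel difference can be sketched as follows. The sum of absolute differences between adjacent reference pixels is used here as the difference measure, which is an assumption (the text names variance only as one example); the function names and the tie-breaking toward the top reference row are also assumptions.

```python
def ref_activity(ref):
    """Sum of absolute differences between adjacent reference pixels."""
    return sum(abs(a - b) for a, b in zip(ref, ref[1:]))


def choose_direction(top_ref, left_ref, prefer_smaller=True):
    """Return "top" or "left" according to the reference line activity.

    Ties go to "top"; that tie-break is an illustrative assumption.
    """
    top_act = ref_activity(top_ref)
    left_act = ref_activity(left_ref)
    if prefer_smaller:
        return "top" if top_act <= left_act else "left"
    return "top" if top_act >= left_act else "left"
```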

In some embodiments, weighted angular prediction can generate predictor pixels for angular prediction using pixels at a projected position on both top reference row and left reference column. For JVET mode 2 or mode 66, a predictor pixel of weighted angular prediction at coordinate (x,y), P(x,y), can be calculated as:

P[x,y]=((((x+1)*Recon[x+y+2,−1])+((y+1)*(Recon[−1,x+y+2]))+(y+x+2)/2)/(y+x+2))

Where Recon[0,0] is a reconstructed pixel at top left coordinate (0,0) of the current CU.
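By way of non-limiting illustration, the weighted angular prediction equation above can be evaluated as follows. The sketch assumes integer pixel values and integer division, with top_ref[i] = Recon[i,−1] and left_ref[j] = Recon[−1,j]; the list names and the use of integer arithmetic are assumptions made for illustration.

```python
def weighted_angular_pixel(top_ref, left_ref, x, y):
    """P[x,y] for mode 2 / mode 66 per the weighted equation above."""
    idx = x + y + 2
    total = x + y + 2  # sum of the two weights (x + 1) and (y + 1)
    weighted = (x + 1) * top_ref[idx] + (y + 1) * left_ref[idx]
    # Adding total // 2 before dividing rounds to the nearest integer.
    return (weighted + total // 2) // total
```

When the two projected reference pixels are equal, the predictor equals that common value regardless of position; at (0,0) the two references are weighted equally.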

The system and method can be extended to support weighted angular prediction by assigning a mode index of either mode 2 or mode 66, that is not used for weighted angular prediction. That is, if mode 2 is assigned to weighted angular prediction, then mode 66 can be assigned to any other known, convenient and/or desired prediction method. In some embodiments the opposite can be true wherein mode 66 is assigned to weighted angular prediction and mode 2 can be assigned to any other known, convenient and/or desired prediction method.

FIG. 14 depicts a simplified block diagram for CU coding with increased efficiency in a JVET encoder substantially similar to that depicted and described in FIGS. 7A and 7B. FIG. 14 depicts a system and method further comprising steps 1402, 1404 and 1406, wherein in step 1402 a determination is made regarding whether intra prediction modes 2 or 66 are employed. Then in step 1404 standard/known and/or convenient prediction coding can be employed, and in step 1406 a selected modified prediction coding can be implemented for prediction modes, as described above in relation to FIG. 13 for weighted or non-weighted angular prediction, after a determination regarding whether weighted or non-weighted angular prediction is used is made in step 705 a. That is, the new mode is able to adaptively set its predictor to follow the predictor of one mode, instead of another, when its prediction direction provides a more accurate predictor, and vice versa. In some embodiments, one heuristic approach is to use available coding information at the decoder side to choose between the two modes (2 and 66). Various information can be used to determine a prediction direction for the new combined mode. In some embodiments, a block dimension, such as width or height, can be used as a selection criterion. In such embodiments, the prediction direction can be chosen such that it follows the direction that has the longer boundary. However, in alternate embodiments the prediction direction that has the shorter boundary can be selected.

In alternate embodiments, it will be readily apparent to those of ordinary skill in the art that the post filtering of step 705 b (shown in FIGS. 7A and 7B) can be implemented concurrently within the system and method depicted and described in relation to FIGS. 7A and 7B.

FIG. 15 depicts a simplified block diagram for CU decoding with increased efficiency in a JVET decoder. FIG. 15 depicts a system and method further comprising steps 1402, 1404 and 1406, wherein in step 1402 a determination is made regarding whether intra prediction modes 2 or 66 are employed. Then in step 1404 standard/known and/or convenient prediction coding can be employed, and in step 1406 a selected modified prediction coding can be implemented for prediction modes, as described above in relation to FIG. 13 for weighted or non-weighted angular prediction, after a determination regarding whether weighted or non-weighted angular prediction is used is made in step 923 a.

In alternate embodiments, it will be readily apparent to those of ordinary skill in the art that the post filtering of step 923 b can be implemented concurrently within the system and method depicted and described in relation to FIG. 9.

The execution of the sequences of instructions required to practice the embodiments can be performed by a computer system 1600 as shown in FIG. 16. In an embodiment, execution of the sequences of instructions is performed by a single computer system 1600. According to other embodiments, two or more computer systems 1600 coupled by a communication link 1615 can perform the sequence of instructions in coordination with one another. Although a description of only one computer system 1600 will be presented below, it should be understood that any number of computer systems 1600 can be employed to practice the embodiments.

A computer system 1600 according to an embodiment will now be described with reference to FIG. 16, which is a block diagram of the functional components of a computer system 1600. As used herein, the term computer system 1600 is broadly used to describe any computing device that can store and independently run one or more programs.

Each computer system 1600 can include a communication interface 1614 coupled to the bus 1606. The communication interface 1614 provides two-way communication between computer systems 1600. The communication interface 1614 of a respective computer system 1600 transmits and receives electrical, electromagnetic or optical signals that include data streams representing various types of signal information, e.g., instructions, messages and data. A communication link 1615 links one computer system 1600 with another computer system 1600. For example, the communication link 1615 can be a LAN, in which case the communication interface 1614 can be a LAN card, or the communication link 1615 can be a PSTN, in which case the communication interface 1614 can be an integrated services digital network (ISDN) card or a modem, or the communication link 1615 can be the Internet, in which case the communication interface 1614 can be a dial-up, cable or wireless modem.

A computer system 1600 can transmit and receive messages, data, and instructions, including program, i.e., application, code, through its respective communication link 1615 and communication interface 1614. Received program code can be executed by the respective processor(s) 1607 as it is received, and/or stored in the storage device 1610, or other associated non-volatile media, for later execution.

In an embodiment, the computer system 1600 operates in conjunction with a data storage system 1631, e.g., a data storage system 1631 that contains a database 1632 that is readily accessible by the computer system 1600. The computer system 1600 communicates with the data storage system 1631 through a data interface 1633. A data interface 1633, which is coupled to the bus 1606, transmits and receives electrical, electromagnetic or optical signals that include data streams representing various types of signal information, e.g., instructions, messages and data. In embodiments, the functions of the data interface 1633 can be performed by the communication interface 1614.

Computer system 1600 includes a bus 1606 or other communication mechanism for communicating instructions, messages and data, collectively, information, and one or more processors 1607 coupled with the bus 1606 for processing information. Computer system 1600 also includes a main memory 1608, such as a random access memory (RAM) or other dynamic storage device, coupled to the bus 1606 for storing dynamic data and instructions to be executed by the processor(s) 1607. The main memory 1608 also can be used for storing temporary data, i.e., variables, or other intermediate information during execution of instructions by the processor(s) 1607.

The computer system 1600 can further include a read only memory (ROM) 1609 or other static storage device coupled to the bus 1606 for storing static data and instructions for the processor(s) 1607. A storage device 1610, such as a magnetic disk or optical disk, can also be provided and coupled to the bus 1606 for storing data and instructions for the processor(s) 1607.

A computer system 1600 can be coupled via the bus 1606 to a display device 1611, such as, but not limited to, a cathode ray tube (CRT) or a liquid-crystal display (LCD) monitor, for displaying information to a user. An input device 1612, e.g., alphanumeric and other keys, is coupled to the bus 1606 for communicating information and command selections to the processor(s) 1607.

According to one embodiment, an individual computer system 1600 performs specific operations by its respective processor(s) 1607 executing one or more sequences of one or more instructions contained in the main memory 1608. Such instructions can be read into the main memory 1608 from another computer-usable medium, such as the ROM 1609 or the storage device 1610. Execution of the sequences of instructions contained in the main memory 1608 causes the processor(s) 1607 to perform the processes described herein. In alternative embodiments, hard-wired circuitry can be used in place of or in combination with software instructions. Thus, embodiments are not limited to any specific combination of hardware circuitry and/or software.

The term “computer-usable medium,” as used herein, refers to any medium that provides information or is usable by the processor(s) 1607. Such a medium can take many forms, including, but not limited to, non-volatile, volatile and transmission media. Non-volatile media, i.e., media that can retain information in the absence of power, includes the ROM 1609, CD ROM, magnetic tape, and magnetic discs. Volatile media, i.e., media that cannot retain information in the absence of power, includes the main memory 1608. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise the bus 1606. Transmission media can also take the form of carrier waves; i.e., electromagnetic waves that can be modulated, as in frequency, amplitude or phase, to transmit information signals. Additionally, transmission media can take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

In the foregoing specification, the embodiments have been described with reference to specific elements thereof. It will, however, be evident that various modifications and changes can be made thereto without departing from the broader spirit and scope of the embodiments. For example, the reader is to understand that the specific ordering and combination of process actions shown in the process flow diagrams described herein is merely illustrative, and that using different or additional process actions, or a different combination or ordering of process actions can be used to enact the embodiments. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense.

It should also be noted that the present invention can be implemented in a variety of computer systems. The various techniques described herein can be implemented in hardware or software, or a combination of both. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code is applied to data entered using the input device to perform the functions described above and to generate output information. The output information is applied to one or more output devices. Each program is preferably implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language can be a compiled or interpreted language. Each such computer program is preferably stored on a storage medium or device (e.g., ROM or magnetic disk) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described above. The system can also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Further, the storage elements of the exemplary computing applications can be relational or sequential (flat file) type computing databases that are capable of storing data in various combinations and configurations.

FIG. 17 is a high level view of a source device 1712 and destination device 1714 that may incorporate features of the systems and devices described herein. As shown in FIG. 17, example video coding system 1710 includes a source device 1712 and a destination device 1714 where, in this example, the source device 1712 generates encoded video data. Accordingly, source device 1712 may be referred to as a video encoding device. Destination device 1714 may decode the encoded video data generated by source device 1712. Accordingly, destination device 1714 may be referred to as a video decoding device. Source device 1712 and destination device 1714 may be examples of video coding devices.

Destination device 1714 may receive encoded video data from source device 1712 via a channel 1716. Channel 1716 may comprise a type of medium or device capable of moving the encoded video data from source device 1712 to destination device 1714. In one example, channel 1716 may comprise a communication medium that enables source device 1712 to transmit encoded video data directly to destination device 1714 in real-time.

In this example, source device 1712 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 1714. The communication medium may comprise a wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or other equipment that facilitates communication from source device 1712 to destination device 1714. In another example, channel 1716 may correspond to a storage medium that stores the encoded video data generated by source device 1712.

In the example of FIG. 17, source device 1712 includes a video source 1718, video encoder 1720, and an output interface 1728. In some cases, output interface 1728 may include a modulator/demodulator (modem) and/or a transmitter. In source device 1712, video source 1718 may include a source such as a video capture device, e.g., a video camera, a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources.

Video encoder 1720 may encode the captured, pre-captured, or computer-generated video data. An input image may be received by the video encoder 1720 and stored in the input frame memory 1721. The general purpose processor 1723 may load information from the input frame memory 1721 and perform encoding. The program for driving the general purpose processor may be loaded from a storage device, such as the example memory modules depicted in FIG. 17. The general purpose processor may use processing memory 1722 to perform the encoding, and the encoding information output by the general purpose processor may be stored in a buffer, such as output buffer 1726.

The video encoder 1720 may include a resampling module 1725 which may be configured to code (e.g., encode) video data in a scalable video coding scheme that defines at least one base layer and at least one enhancement layer. Resampling module 1725 may resample at least some video data as part of an encoding process, wherein resampling may be performed in an adaptive manner using resampling filters.

The encoded video data, e.g., a coded bit stream, may be transmitted directly to destination device 1714 via output interface 1728 of source device 1712. In the example of FIG. 17, destination device 1714 includes an input interface 1738, a video decoder 1730, and a display device 1732. In some cases, input interface 1738 may include a receiver and/or a modem. Input interface 1738 of destination device 1714 receives encoded video data over channel 1716. The encoded video data may include a variety of syntax elements generated by video encoder 1720 that represent the video data. Such syntax elements may be included with the encoded video data transmitted on a communication medium, stored on a storage medium, or stored on a file server.

The encoded video data may also be stored onto a storage medium or a file server for later access by destination device 1714 for decoding and/or playback. For example, the coded bitstream may be temporarily stored in the input buffer 1731, then loaded into the general purpose processor 1733. The program for driving the general purpose processor may be loaded from a storage device or memory. The general purpose processor may use a process memory 1732 to perform the decoding. The video decoder 1730 may also include a resampling module 1735 similar to the resampling module 1725 employed in the video encoder 1720.

FIG. 17 depicts the resampling module 1735 separately from the general purpose processor 1733, but it would be appreciated by one of skill in the art that the resampling function may be performed by a program executed by the general purpose processor, and the processing in the video encoder may be accomplished using one or more processors. The decoded image(s) may be stored in the output frame buffer 1736 and then sent out to the input interface 1738.

Display device 1738 may be integrated with or may be external to destination device 1714. In some examples, destination device 1714 may include an integrated display device and may also be configured to interface with an external display device. In other examples, destination device 1714 may be a display device. In general, display device 1738 displays the decoded video data to a user.

Video encoder 1720 and video decoder 1730 may operate according to a video compression standard. ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) are studying the potential need for standardization of future video coding technology with a compression capability that significantly exceeds that of the current High Efficiency Video Coding (HEVC) standard (including its current extensions and near-term extensions for screen content coding and high-dynamic-range coding). The groups are working together on this exploration activity in a joint collaboration effort known as the Joint Video Exploration Team (JVET) to evaluate compression technology designs proposed by their experts in this area. A recent capture of JVET development is described in the “Algorithm Description of Joint Exploration Test Model 5 (JEM 5)”, JVET-E1001-V2, authored by J. Chen, E. Alshina, G. Sullivan, J. Ohm, J. Boyce.

Additionally or alternatively, video encoder 1720 and video decoder 1730 may operate according to other proprietary or industry standards that function with the disclosed JVET features. Such other standards include the ITU-T H.264 standard, alternatively referred to as MPEG-4, Part 10, Advanced Video Coding (AVC), and extensions of such standards. Thus, while newly developed for JVET, techniques of this disclosure are not limited to any particular coding standard or technique. Other examples of video compression standards and techniques include MPEG-2, ITU-T H.263 and proprietary or open source compression formats and related formats.

Video encoder 1720 and video decoder 1730 may be implemented in hardware, software, firmware or any combination thereof. For example, the video encoder 1720 and decoder 1730 may employ one or more processors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, or any combinations thereof. When the video encoder 1720 and decoder 1730 are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 1720 and video decoder 1730 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.

Aspects of the subject matter described herein may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer, such as the general purpose processors 1723 and 1733 described above. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. Aspects of the subject matter described herein may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

Examples of memory include random access memory (RAM), read only memory (ROM), or both. Memory may store instructions, such as source code or binary code, for performing the techniques described above. Memory may also be used for storing variables or other intermediate information during execution of instructions to be executed by a processor, such as processors 1723 and 1733.

A storage device may also store instructions, such as source code or binary code, for performing the techniques described above. A storage device may additionally store data used and manipulated by the computer processor. For example, a storage device in a video encoder 1720 or a video decoder 1730 may be a database that is accessed by computer system 1723 or 1733. Other examples of storage devices include random access memory (RAM), read only memory (ROM), a hard drive, a magnetic disk, an optical disk, a CD-ROM, a DVD, a flash memory, a USB memory card, or any other medium from which a computer can read.

A memory or storage device may be an example of a non-transitory computer-readable storage medium for use by or in connection with the video encoder and/or decoder. The non-transitory computer-readable storage medium contains instructions for controlling a computer system to be configured to perform functions described by particular embodiments. The instructions, when executed by one or more computer processors, may be configured to perform that which is described in particular embodiments.

Also, it is noted that some embodiments have been described as a process which can be depicted as a flow diagram or block diagram. Although each may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be rearranged. A process may have additional steps not included in the figures.

Particular embodiments may be implemented in a non-transitory computer-readable storage medium for use by or in connection with the instruction execution system, apparatus, system, or machine. The computer-readable storage medium contains instructions for controlling a computer system to perform a method described by particular embodiments. The computer system may include one or more computing devices. The instructions, when executed by one or more computer processors, may be configured to perform that which is described in particular embodiments.

As used in the description herein and throughout the claims that follow, “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.

Although exemplary embodiments of the invention have been described in detail and in language specific to structural features and/or methodological acts above, it is to be understood that those skilled in the art will readily appreciate that many additional modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of the invention. Moreover, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Accordingly, these and all such modifications are intended to be included within the scope of this invention construed in breadth and scope in accordance with the appended claims.

1-16. (canceled)
17. A method of decoding video with a decoder, comprising: (a) receiving a coding unit (CU) within a coding area of a video frame having CU x and CU y coordinates; (b) receiving a prediction mode for said coding area of said video frame; (c) wherein said prediction mode for said coding area of said video frame is coded using the same codeword that is used for a plurality of different prediction modes; (d) wherein each of said plurality of different prediction modes having said same codeword is differentiated based at least in part on a prediction direction; (e) decoding said video frame based upon said prediction CU.
18. The method of decoding video of claim 17 wherein said prediction direction is based upon any characteristic of said coding unit.
19. The method of decoding video of claim 18 wherein said prediction CU is entropy decoded.
20. The method of decoding video of claim 18 wherein said prediction mode is based at least in part on a width of said coding unit.
21. The method of decoding video of claim 20 wherein said prediction mode is based at least in part on a height of said coding unit.
22. The method of decoding video of claim 18 wherein said prediction direction is based at least in part on a height of said coding unit.
23. The method of decoding video of claim 22 wherein said prediction direction is based at least in part on a width of said coding unit.
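
By way of illustration only, and without limiting the claims, the following minimal sketch shows one way a decoder might differentiate two prediction modes that share a single codeword by using the width and height of the coding unit. The mode numbers 2 and 66 are taken from the abstract. Everything else is an assumption made for this sketch: the `CodingUnit` type, the `SHARED_CODEWORD` placeholder, the `resolve_prediction_mode` function, and in particular the rule that a coding unit wider than it is tall resolves to mode 66 while all others resolve to mode 2.

```python
from dataclasses import dataclass

# Mode numbers from the abstract; the names are hypothetical.
MODE_2 = 2
MODE_66 = 66

# Placeholder for the codeword shared by both modes (hypothetical value).
SHARED_CODEWORD = "shared"


@dataclass(frozen=True)
class CodingUnit:
    """Hypothetical container for a coding unit's position and size."""
    x: int
    y: int
    width: int
    height: int


def resolve_prediction_mode(codeword: str, cu: CodingUnit) -> int:
    """Map the shared codeword to one of two modes using the CU shape.

    Assumed rule (not stated in the source): a CU wider than it is tall
    resolves to mode 66; any other CU resolves to mode 2.
    """
    if codeword != SHARED_CODEWORD:
        raise ValueError("codeword is not the shared codeword")
    if cu.width > cu.height:
        return MODE_66
    return MODE_2


if __name__ == "__main__":
    wide = CodingUnit(x=0, y=0, width=16, height=4)
    tall = CodingUnit(x=0, y=0, width=4, height=16)
    print(resolve_prediction_mode(SHARED_CODEWORD, wide))
    print(resolve_prediction_mode(SHARED_CODEWORD, tall))
```

Because the decoder already knows the coding unit dimensions, a rule of this kind needs no extra signaling to tell the two modes apart; a different shape test or tie-breaking rule for square units could be substituted.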